1 Introduction

Let $\Phi$ be a normed space of functions and let $A$ be a subset of $\Phi$. The prototypical problem in approximation theory consists in approximating an element $f$ of $\Phi$ by an element of $A$, that is, looking for an element of $A$ that has minimum distance from $f$. It is also natural to consider the distance of $f$ from $A$,

$$\delta(f, A) = \inf_{a \in A} \|f - a\| \qquad (1)$$

and to study this quantity for different choices of $A$ and $f \in \Phi$. In the classical theory of approximation the set $A$ is usually a linear $k$-dimensional subspace $A_k \subset \Phi$ (Lorentz, 1986) (the algebraic or the trigonometric polynomials of given degree and the splines with fixed knots are typical examples of such subspaces), while in nonlinear approximation theory the linear subspace $A_k$ is replaced by a $k$-dimensional manifold $M_k$ (DeVore, 1991). Usually one has a family of manifolds $\{M_k\}_{k=1}^{\infty}$ such that $\bigcup_k M_k$ is dense in $\Phi$ and

$$M_1 \subset M_2 \subset \dots \subset M_n \subset \dots$$

so that the quantity $\delta(f, M_k)$ is a monotone decreasing function of $k$ converging to zero, and the approximation in $M_k$ gets arbitrarily close to $f$ provided one takes $k$ sufficiently large. However, since the computational time needed to find an approximation to $f$ in $M_k$ increases with $k$, it is of great interest to know the rate of convergence to zero of $\delta(f, M_k)$ as a function of $k$. This rate of convergence can be taken as a measure of the complexity of $f$ with respect to the manifolds $M_k$, in the sense that “simple” functions should have a fast rate of convergence.

As an example, let us consider as space $\Phi$ the space $\Lambda_\alpha^s$ of the functions whose partial derivatives of order $s$ are bounded in the uniform norm on the $d$-dimensional cube $I = [0,1]^d$ and satisfy a Lipschitz condition with exponent $\alpha$ (Lorentz, 1986, p. 50). On the space $\Phi$ we consider the uniform norm $\|f\| = \max_{x \in I} |f(x)|$. Choosing as manifold $M_k$ the set of polynomials of degree $n - 1$ in each of the $d$ variables, that is a linear space of dimension $k = n^d$, the following bound can be obtained (Lorentz, 1986):

$$\delta(f, M_k) \le N k^{-\frac{s+\alpha}{d}} \qquad (2)$$

where $N$ is a constant that depends on $f$ and $s$.

From this example we see that the rate of convergence slows down dramatically when the dimension $d$ increases, revealing the discouraging phenomenon known as the “curse of dimensionality” (Bellman, 1961). However, for every fixed number of dimensions, arbitrary inverse-power rates of convergence can be obtained if the smoothness index $s$ is chosen big enough. This result is typical in linear approximation theory, since the computation of the $n$-width of this space shows that the best linear technique cannot improve the rate of convergence $O(k^{-\frac{s+\alpha}{d}})$ (Lorentz, 1986, p. 135).
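As a rough numerical illustration, the sketch below assumes a bound of the form $N k^{-(s+\alpha)/d} \le \varepsilon$ and inverts it to estimate how many parameters $k$ are needed for a target error. The helper name `parameters_needed` and the values $N = 1$, $s = 2$, $\alpha = 1$, $\varepsilon = 0.1$ are illustrative assumptions, not taken from the text.

```python
import math

def parameters_needed(eps, s, alpha, d, N=1.0):
    """Smallest k with N * k**(-(s + alpha) / d) <= eps (hypothetical helper)."""
    return math.ceil((N / eps) ** (d / (s + alpha)))

# Fixed smoothness: the required k grows exponentially with the dimension d.
for d in (1, 2, 4, 8, 16):
    print("d =", d, "k =", parameters_needed(eps=0.1, s=2, alpha=1.0, d=d))

# Smoothness growing with the dimension (s + alpha = d): k no longer depends on d.
for d in (2, 4, 8, 16):
    print("d =", d, "k =", parameters_needed(eps=0.1, s=d - 1, alpha=1.0, d=d))
```

With the smoothness held fixed the count explodes with $d$, while letting $s + \alpha$ grow like $d$ keeps it constant, which is the trade-off described above.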

Similar results, in both linear and nonlinear approximation theory (DeVore, 1991), hold for other spaces of functions in which smoothness is measured in a different way. We are therefore led to argue that in practical situations we can only approximate functions whose smoothness increases with the dimension. As an example, consider again the spaces $\Lambda_\alpha^s$ of the example above with $s = d$. It is clear from eq. (2) that in this case the rate of convergence of polynomial approximation to an $f \in \Lambda_\alpha^d$ is $O(k^{-1})$, and it is in this sense “independent of the dimensionality”.

In a recent paper, Jones (1990) showed how to construct a sequence of functions $f_n$ that approximate certain functions in a Hilbert space with a rate of convergence $O(1/\sqrt{n})$. A statement of Jones’ lemma is given in section 2. Applications of this result to projection pursuit regression and neural networks have already been presented in (Jones, 1990; Barron, 1991), where appropriate approximation schemes and spaces of functions in $\mathbb{R}^d$ are described in which the complexity of approximation increases mildly with $d$. It is worthwhile to observe that this is obtained at the expense of requiring that the functions contained in these spaces be more and more “regular” as $d$ increases. Moreover, it is not completely clear yet how computationally expensive the approximation $f_n$ may be. A very short review of Jones’ and Barron’s results is given in section 5.

The aim of this paper is to present an application of Jones’ lemma to approximation by linear combinations of translates of a given function $G$. In particular, for appropriate choices of $G$ we obtain estimates for the rate of convergence of certain Radial Basis Functions schemes (Micchelli, 1986; Powell, 1987; Dyn, 1991; Poggio and Girosi, 1990) on certain spaces of functions of Sobolev type. For the convenience of the reader we collect in the appendix a few known results about Sobolev spaces and integration of Banach-valued functions.


2  The  Maurey-Jones-Barron  Lemma 

Our result is based on a lemma by Jones (1990) on the convergence rate of an iterative approximation scheme in Hilbert spaces. A formally similar lemma, brought to our attention by R. Dudley (Dudley, 1991), is due to Maurey and was published by Pisier in 1981. However, Jones’ lemma is constructive while Maurey’s is not. Here we report a version of the lemma due to Barron (1991) that contains a slight refinement of Jones’ result:

Lemma 2.1 (Maurey-Jones-Barron) If $f$ is in the closure of the convex hull of a set $\mathcal{G}$ in a Hilbert space $H$ with $\|g\| \le b$ for each $g \in \mathcal{G}$, then for every $n \ge 1$ and for $c > b^2 - \|f\|^2$ there is an $f_n$ in the convex hull of $n$ points in $\mathcal{G}$ such that

$$\|f - f_n\|^2 \le \frac{c}{n} .$$


The interesting feature of this lemma is that the sequence $\{f_n\}_{n=1}^{\infty}$ has the following structure:

$$f_{n+1} = \alpha_n f_n + (1 - \alpha_n) g_n \qquad (3)$$

where $\alpha_n \in [0,1]$ and $g_n \in \mathcal{G}$ are chosen in order to “approximately solve” the following minimization problem:

$$\inf_{\alpha_n, g_n} \|f - \alpha_n f_n - (1 - \alpha_n) g_n\|$$

where by “approximately solve” we mean that it is sufficient at each step to reach a distance from the infimum of order $O(1/n^2)$. The lemma is therefore constructive, providing a procedure that can achieve the prescribed rate.
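The iteration can be tried out in a finite-dimensional Hilbert space. The following is a minimal sketch, not the procedure of Jones or Barron as published: it assumes a finite dictionary of random vectors in $\mathbb{R}^{20}$, a target $f$ built as a convex combination of them, and an exact (rather than approximate) minimization at each step. All names (`greedy_approximation`, `dictionary`, `c0`) are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def sq_dist(u, v):
    d = sub(u, v)
    return dot(d, d)

def combine(alpha, u, v):
    """Return alpha * u + (1 - alpha) * v."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(u, v)]

def greedy_approximation(f, dictionary, n_terms):
    """Squared errors ||f - f_n||^2, n = 1..n_terms, for f_{n+1} = a_n f_n + (1 - a_n) g_n."""
    f_n = min(dictionary, key=lambda g: sq_dist(f, g))  # best single element
    errors = [sq_dist(f, f_n)]
    for _ in range(n_terms - 1):
        best_err, best_next = None, None
        for g in dictionary:
            # f - a f_n - (1 - a) g = u - a v, with u = f - g and v = f_n - g.
            u, v = sub(f, g), sub(f_n, g)
            vv = dot(v, v)
            # Exact minimiser of ||u - a v||^2 over a in [0, 1].
            alpha = 0.0 if vv == 0.0 else min(1.0, max(0.0, dot(u, v) / vv))
            candidate = combine(alpha, f_n, g)
            err = sq_dist(f, candidate)
            if best_err is None or err < best_err:
                best_err, best_next = err, candidate
        f_n = best_next
        errors.append(best_err)
    return errors

random.seed(0)
dim, size = 20, 50
dictionary = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(size)]
weights = [random.random() for _ in range(size)]
total = sum(weights)
weights = [w / total for w in weights]
# Target in the convex hull of the dictionary.
f = [sum(w * g[j] for w, g in zip(weights, dictionary)) for j in range(dim)]

b_sq = max(dot(g, g) for g in dictionary)   # b^2 with ||g|| <= b
c0 = b_sq - dot(f, f)                       # b^2 - ||f||^2
errors = greedy_approximation(f, dictionary, 30)
for n in (1, 5, 10, 30):
    print(n, errors[n - 1], c0 / n)
```

In this setting each printed error can be compared directly with the corresponding value of $c/n$ for $c$ just above $b^2 - \|f\|^2$.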

In order to exploit this result we need to define suitable classes of functions which are the closure of the convex hull of some subset $\mathcal{G}$ of a Hilbert space $H$. We are therefore naturally led to study functions that can be represented as “infinite” convex combinations of elements $g_i \in \mathcal{G}$ of the type

$$f = \sum_{i=1}^{\infty} \alpha_i g_i , \qquad \alpha_i \ge 0 , \qquad \sum_{i=1}^{\infty} \alpha_i = 1 . \qquad (4)$$

One way to approach the problem consists in utilizing the integral representation of functions. Suppose that the functions in a Hilbert space $H$ can be represented by the integral

$$f(x) = \int_{\mathcal{M}} G_t(x)\, d\alpha(t) \qquad (5)$$

where $d\alpha$ is some measure on the parameter set $\mathcal{M}$. If $d\alpha$ is a finite measure, the integral (5) can be seen as an infinite convex combination of the type of eq. (4), and therefore the function $f$ belongs to the closure of the convex hull of some subset of $H$. In the next section we formalize this idea in the special case in which the functions $G_t(x)$ are the translates $G(x - t)$ of a fixed function $G$, and we show how it leads to approximation techniques whose rate of convergence in appropriate spaces of functions is $O(1/\sqrt{n})$.


3 Approximation by Translates of a Function G

Let $G$ be a fixed function belonging to $L_2(\mathbb{R}^d) = L_2$. We define the space $L_G$ as the set of the functions of the form

$$f = G * \lambda \qquad (6)$$

where $\lambda$ is any signed Radon measure whose total variation $|\lambda|(\mathbb{R}^d) = \|\lambda\|$ is finite and the symbol $*$ stands for the convolution operation. Assuming from now on that $\|G\|_{L_2} = 1$, the following inequality holds (Stein and Weiss, 1971)

$$\|f\|_{L_2} \le \|\lambda\|$$

showing the inclusion $L_G \subset L_2$. It is natural to approximate elements of $L_G$ by elements of the set

$$G_n = \Big\{ f \in L_2 \;\Big|\; f = \sum_{\alpha=1}^{n} \lambda_\alpha G_{t_\alpha} , \ \lambda_\alpha \in \mathbb{R} , \ t_\alpha \in \mathbb{R}^d \Big\} , \qquad (7)$$

where we indicate by $G_t$ the function $G$ translated by the vector $t$, that is $G_t(x) = G(x - t)$. Using lemma 2.1 we can now prove the following:

Theorem 3.1 Let $f$ be a function in $L_G$, so that $f = G * \lambda$, where $G \in L_2$, $\|G\|_{L_2} = 1$, and $\lambda$ is a signed Radon measure of bounded total variation $\|\lambda\|$. Then $f$ belongs to the $L_2$-closure of the convex hull of the set

$$A = \{ s G_t \mid t \in \mathbb{R}^d , \ |s| \le \|\lambda\| \}$$

and there exist $n$ coefficients $c_\alpha$ and $n$ vectors $t_\alpha$ such that

$$\Big\| f - \sum_{\alpha=1}^{n} c_\alpha G(x - t_\alpha) \Big\|_{L_2}^2 \le \frac{c}{n}$$

for all $c > \|\lambda\|^2 - \|f\|_{L_2}^2$.

Proof: We consider the vector-valued function

$$\Gamma : \mathbb{R}^d \to L_2$$

such that

$$\Gamma(t) = G_t .$$

The function $\Gamma$ is continuous, hence $\lambda$-measurable; moreover one has

$$\int_{\mathbb{R}^d} \|\Gamma(t)\|_{L_2}\, d|\lambda|(t) = \|G\|_{L_2} \int_{\mathbb{R}^d} d|\lambda|(t) = \|\lambda\| < +\infty .$$

Therefore the Bochner integral of $\Gamma$ with respect to $\lambda$ exists (see appendix A):

$$\eta = \int_{\mathbb{R}^d} \Gamma(t)\, d\lambda(t) ,$$

and by lemma (A.2) we have

$$\eta \in \overline{\mathrm{co}\, A} \qquad (8)$$

where $A = \{ s G_t \mid t \in \mathbb{R}^d , \ |s| \le \|\lambda\| \}$, $\mathrm{co}\, A$ stands for the convex hull of the set $A$ and the bar stands for the closure in $L_2$. Now we shall prove that $\eta = f$. This can be done by proving that

$$F \cdot f = F \cdot \eta , \qquad \forall F \in (L_2)^* \qquad (9)$$

where $(L_2)^*$ is the dual space of $L_2$, that is $L_2$ itself. From the properties of the Bochner integral we have:

$$F \cdot \eta = F \cdot \int_{\mathbb{R}^d} \Gamma(t)\, d\lambda(t) = \int_{\mathbb{R}^d} (F \cdot \Gamma(t))\, d\lambda(t) .$$

Taking this into account, the identity (9) can be written as:

$$\int_{\mathbb{R}^d} dx\, \phi(x) \int_{\mathbb{R}^d} G(x - t)\, d\lambda(t) = \int_{\mathbb{R}^d} d\lambda(t) \int_{\mathbb{R}^d} dx\, \phi(x)\, G(x - t) , \qquad \forall \phi \in L_2 .$$

Now by Fubini’s theorem the two sides of this last equation are equal, and therefore $\eta = f$.

By eq. (8), $f = \eta$ belongs to the $L_2$-closure of the convex hull of the set $A$, which is contained in the ball of radius $\|\lambda\|$. By the Maurey-Jones-Barron lemma we can find a set of $n$ coefficients $c_\alpha$ and $n$ vectors $t_\alpha$ such that

$$\Big\| f - \sum_{\alpha=1}^{n} c_\alpha G(x - t_\alpha) \Big\|_{L_2}^2 \le \frac{c}{n}$$

for all $c > C(f) = \|\lambda\|^2 - \|f\|_{L_2}^2$. □
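To see where the constant $\|\lambda\|^2 - \|f\|_{L_2}^2$ comes from, one can look at the random-sampling argument often used for lemmas of this type (this is not the constructive scheme used in the proof above). The sketch below assumes a hypothetical discrete measure $\lambda = \sum_j \lambda_j \delta_{\tau_j}$ in $\mathbb{R}^2$ and a Gaussian normalized so that $\|G\|_{L_2} = 1$, for which the inner product of two translates has the closed form $e^{-\|t-u\|^2/2}$. It checks that one element of $A$, drawn with probability $|\lambda_j| / \|\lambda\|$, has expected squared error exactly $\|\lambda\|^2 - \|f\|_{L_2}^2$; averaging $n$ independent draws then divides this by $n$.

```python
import math

def gram(t, u):
    """<G_t, G_u> in L2 for G(x) = (2/pi)**(d/4) * exp(-||x||^2), so that ||G|| = 1."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(t, u)))

# Hypothetical discrete measure: lambda = sum_j masses[j] * delta(centers[j]).
centers = [(0.0, 0.0), (1.0, 0.5), (-0.7, 1.2), (2.0, -1.0)]
masses = [0.5, -1.0, 0.8, 0.3]

total_variation = sum(abs(m) for m in masses)  # ||lambda||

def inner_with_f(t):
    """<f, G_t> for f = G * lambda = sum_j masses[j] * G_{centers[j]}."""
    return sum(m * gram(tau, t) for m, tau in zip(masses, centers))

f_norm_sq = sum(m * inner_with_f(tau) for m, tau in zip(masses, centers))

def sq_error_single(j):
    """||f - s G_t||^2 for t = centers[j] and s = sign(masses[j]) * ||lambda||."""
    s = math.copysign(total_variation, masses[j])
    return f_norm_sq - 2.0 * s * inner_with_f(centers[j]) + s * s

# Index j is drawn with probability |masses[j]| / ||lambda||; the drawn element has mean f.
expected_one_term = sum(
    abs(m) / total_variation * sq_error_single(j) for j, m in enumerate(masses)
)
c = total_variation ** 2 - f_norm_sq

def expected_sq_error(n):
    """Expected squared L2 error of the average of n independent draws."""
    return c / n

print(expected_one_term, c, expected_sq_error(10))
```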

In theorem (3.1) the approximation error is measured in the $L_2$ norm. Imposing some restrictions on the function $G$, a similar estimate can be obtained for other norms, and in particular for the $L_\infty$ norm. In fact, suppose that $G \in H^{s,2}(\mathbb{R}^d) = H^{s,2}$, where $H^{s,2}$ is the Sobolev space of the functions whose weak derivatives up to order $s$ are in $L_2$ (see Appendix B). Then one can easily see that theorem (3.1) can be formulated in the Hilbert space $H^{s,2}$ instead of $L_2$:

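Theorem 3.2 Let $f$ be a function such that $f = G * \lambda$, where $G \in H^{s,2}$, $\|G\|_{H^{s,2}} = 1$, and $\lambda$ is a signed Radon measure of bounded total variation $\|\lambda\|$. Then $f$ belongs to the $H^{s,2}$-closure of the convex hull of the set

$$A = \{ s G_t \mid t \in \mathbb{R}^d , \ |s| \le \|\lambda\| \}$$

and there exist $n$ coefficients $c_\alpha$ and $n$ vectors $t_\alpha$ such that

$$\Big\| f - \sum_{\alpha=1}^{n} c_\alpha G(x - t_\alpha) \Big\|_{H^{s,2}}^2 \le \frac{c}{n}$$

for all $c > \|\lambda\|^2 - \|f\|_{H^{s,2}}^2$.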

We notice that if the condition $s > \frac{d}{2}$ holds, then the Sobolev embedding theorem (see Appendix B) guarantees that $H^{s,2} \subset C^0$ and that there exists $C_1 > 0$ such that

$$\|\cdot\|_{L_\infty} \le C_1 \|\cdot\|_{H^{s,2}} .$$

Therefore the approximating sequence $\{f_n\}$ converges uniformly, and the following corollary holds:

Corollary 3.1 Under the conditions of theorem (3.2), if $s > \frac{d}{2}$ there exist $n$ coefficients $c_\alpha$, $n$ vectors $t_\alpha$ and a constant $C_1$ such that

$$\Big\| f - \sum_{\alpha=1}^{n} c_\alpha G(x - t_\alpha) \Big\|_{L_\infty}^2 \le C_1^2\, \frac{c}{n}$$

for all $c > \|\lambda\|^2 - \|f\|_{H^{s,2}}^2$.

From a practical point of view, in many cases what is really of interest is an estimate of the error in the sup norm, instead of the $L_2$ or $H^{s,2}$ norm. Think for example of the problem of approximating the trajectory of a robot arm: it is clear that what is really needed in this case is a small $L_\infty$ norm of the difference between the desired and the approximated trajectory, while a small $L_2$ norm is of little interest.

Remark: we notice that the elements of the set $G_n$ defined by eq. (7) can also be seen as points of a manifold $M_k$ whose dimension is $k = n(d + 1)$. Therefore theorem (3.1) can also be formulated in terms of the number of parameters $k$ that are needed to achieve a certain error, saying that if $f \in L_G$ then

$$\delta(f, M_k)^2 \le \frac{c\,(d + 1)}{k} .$$

If we compare this result with the typical estimates (DeVore, 1991), we notice that in this case the way the dimension affects the convergence curve is much less dramatic, corresponding to a simple scale dilation. This means that in some sense the complexity of the space $L_G$ does not increase very much when the dimension increases. It is interesting to characterize, for several specific choices of $G$, the structure of $L_G$ and to understand whether it contains a “sufficiently large” set of functions, where by “sufficiently large” we mean large enough to contain functions that are encountered in practical cases. This will be done in the next section for two particular choices of $G$.
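The parameter count in the remark can be made concrete with a few lines of arithmetic. This is a sketch assuming the bound $\|f - f_n\|_{L_2}^2 \le c/n$ of theorem (3.1) and $k = n(d+1)$; the helper names and the sample values of $c$ and of the target error are illustrative assumptions.

```python
import math

def terms_needed(c, eps):
    """Smallest n with c / n <= eps**2, i.e. L2 error at most eps."""
    return math.ceil(c / eps ** 2)

def rbf_parameters_needed(c, eps, d):
    """k = n (d + 1): each term carries one coefficient and a d-dimensional centre."""
    return terms_needed(c, eps) * (d + 1)

# For a fixed constant c the parameter count grows only linearly with d.
for d in (1, 2, 4, 8, 16):
    print("d =", d, "k =", rbf_parameters_needed(c=2.0, eps=0.5, d=d))
```

Note that the constant $c$ itself depends on $f$ through $\|\lambda\|$, so the linear growth holds only for functions whose representing measure has comparable total variation across dimensions.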


4  Examples  of  functions  G 

In this section we consider two choices for the function $G$ and study the corresponding function spaces $L_G$. We recall that for any given $G \in L_2(\mathbb{R}^d)$ the space $L_G$ is defined as

$$L_G = \{ f \in L_2(\mathbb{R}^d) \mid f = G * \lambda , \ \lambda \in M(\mathbb{R}^d) \}$$

where $M(\mathbb{R}^d) = M$ is the space of signed Radon measures of bounded total variation on $\mathbb{R}^d$.

4.1  The  Gaussian 

We consider the Gaussian function $G(x) = e^{-\|x\|^2}$, since approximation with Gaussian basis functions is often used in practical applications (Moody and Darken, 1989; Poggio and Girosi, 1990; Poggio and Edelman, 1990; Sanner and Slotine, 1991). Clearly $G \in L_2(\mathbb{R}^d)$, so that the space $L_G$ is well defined in any dimension. Due to the smoothness of the Gaussian and to its fast decay, this space of functions is rather small. However, it contains an interesting subset of the space of band-limited functions, the functions whose Fourier transform has compact support. In particular, let us define the space of functions $B_k(\mathbb{R}^d)$:

$$B_k(\mathbb{R}^d) \equiv \{ f \mid \tilde f \in C_c^k(\mathbb{R}^d) \} , \qquad (10)$$

that is the set of functions whose Fourier transform has compact support and $k$ continuous derivatives. Then the following inclusion holds:

$$B_k(\mathbb{R}^d) \subset L_G , \qquad \forall k > \frac{d}{2} . \qquad (11)$$

In fact if $f \in B_k(\mathbb{R}^d)$ then we have (with the Fourier transform normalized as in section 4.2)

$$\frac{\tilde f(s)}{\tilde G(s)} = a\, e^{\pi^2 \|s\|^2} \tilde f(s) \equiv \tilde\lambda(s) \in C_c^k(\mathbb{R}^d) ,$$

where $a$ is a constant depending only on the dimension $d$. Therefore $f = G * \lambda$, where $\lambda$ is the inverse Fourier transform of the function $\tilde\lambda = \tilde f / \tilde G$. Since the following inclusion holds (see appendix B):

$$C_c^k(\mathbb{R}^d) \subset A(\mathbb{R}^d) , \qquad \forall k > \frac{d}{2} ,$$

where $A(\mathbb{R}^d)$ is the space of the functions whose Fourier transform belongs to $L_1(\mathbb{R}^d)$, then $\lambda \in L_1(\mathbb{R}^d) \subset M(\mathbb{R}^d)$ and $f \in L_G$.

We notice that the Gaussian function and its derivatives of any order belong to $L_2$, and therefore $G \in H^{s,2}$ for any $s > 0$. Hence we can apply corollary (3.1) to conclude that the convergence rate $O(1/\sqrt{n})$ also holds for approximation in the sup norm.
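For concreteness, an element of the approximating set with Gaussian basis functions can be evaluated as follows. This is a minimal sketch assuming the unnormalized Gaussian $G(x) = e^{-\|x\|^2}$; the function names and the sample coefficients and centres are hypothetical.

```python
import math

def gaussian(x, t):
    """G(x - t) with G(x) = exp(-||x||^2)."""
    return math.exp(-sum((xi - ti) ** 2 for xi, ti in zip(x, t)))

def rbf_expansion(x, coefficients, centers):
    """Evaluate f_n(x) = sum_a c_a G(x - t_a)."""
    return sum(c * gaussian(x, t) for c, t in zip(coefficients, centers))

coefficients = [2.0, -0.5]
centers = [(0.0, 0.0), (1.0, 0.0)]
print(rbf_expansion((0.0, 0.0), coefficients, centers))
```

An expansion with $n$ terms in $d$ dimensions stores $n$ coefficients and $n$ centres, that is $n(d+1)$ numbers in total.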


4.2  Bessel-Macdonald  Kernels 

We now consider the Bessel-Macdonald kernels, a family of functions $G_m(x)$ defined in terms of their Fourier transforms:

$$\tilde G_m(s) = \frac{1}{(1 + 4\pi^2 \|s\|^2)^{m/2}} , \qquad m > 0 .$$

The functions $G_m(x)$ are integrable functions that decay exponentially at infinity and may have a singularity at the origin (Stein, 1970, p. 132). However, if $m > d$ they are continuous and actually differentiable of any order $q < m - d$. We want to work with continuous functions, and in what follows we will always make the assumption $m > d$. Since $\tilde G_m(s)$ is positive and radial, we also have that, by Bochner’s theorem, $G_m(x)$ is positive definite (Micchelli, 1986), and therefore approximation by translates of $G_m(x)$ is a Radial Basis Functions approximation scheme. The following observations can be made regarding the functions $G_m$ and the space $L_{G_m}$:

1. One has

$$G_m \in H^{s,2} \quad \text{for } 0 < s < m - \frac{d}{2} .$$

Since we have made the assumption $m > d$, one can take $s$ such that $\frac{d}{2} < s < m - \frac{d}{2}$. Then we can apply corollary (3.1) to conclude that the rate of convergence $O(1/\sqrt{n})$ also holds for approximation in the sup norm.

2. Since $L_1 \subset M$, the space $L_{G_m}$ contains the space $\mathcal{L}_1^m(\mathbb{R}^d) = \mathcal{L}_1^m$ of those functions that can be written as $f = G_m * \lambda$ with $\lambda \in L_1$. For more information about the space $\mathcal{L}_1^m$, which is a special instance of the so-called potential spaces, the reader is referred to (Stein, 1970). The space $\mathcal{L}_1^m$ is related to the Sobolev space $H^{m,1}(\mathbb{R}^d) = H^{m,1}$ of the functions whose weak derivatives up to order $m$ are in $L_1$ (see Appendix B). More precisely one has (Stein, 1970, p. 160):

$$H^{m,1} \subset \mathcal{L}_1^m , \qquad m \text{ even} .$$

Therefore we conclude that if $m > d$ and $m$ is even, by superposition of translates of $G_m$ we can approximate with a rate of convergence $O(1/\sqrt{n})$ all the functions of $H^{m,1}$, and hence all $C^m$ functions which decrease rapidly at infinity.

3. Again for $s < m - \frac{d}{2}$ and $m > d$, $m$ even, we have the following characterization of the space $L_{G_m}$:

$$L_{G_m} = \{ f \in H^{s,2} \mid (I - \Delta)^{m/2} f \in M \} .$$

In fact, if $f \in L_{G_m}$, that is $f = G_m * \lambda$ with $\lambda \in M$, then $(I - \Delta)^{m/2} f = \lambda$, since $G_m$ is the fundamental solution of the operator $(I - \Delta)^{m/2}$. On the other hand, if $f \in H^{s,2}$ and $(I - \Delta)^{m/2} f = \lambda \in M$, then by taking the convolution of both sides with $G_m$ we have $f = G_m * \lambda$.
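A simple special case makes these kernels tangible. In one dimension with $m = 2$ (so that $m > d$), the kernel with Fourier transform $(1 + 4\pi^2 s^2)^{-1}$ is $G_2(x) = \frac{1}{2} e^{-|x|}$, which is continuous, decays exponentially and has a kink at the origin. The sketch below checks this numerically with a plain trapezoidal rule; the closed form is a standard transform pair stated here as an assumption of the example, and the helper names, integration range and step are illustrative choices.

```python
import math

def bessel_macdonald_ft(s, m):
    """Fourier transform of G_m at frequency s for d = 1: (1 + 4 pi^2 s^2)^(-m/2)."""
    return (1.0 + 4.0 * math.pi ** 2 * s ** 2) ** (-m / 2.0)

def g2(x):
    """Assumed closed form of G_2 in one dimension: exp(-|x|) / 2."""
    return 0.5 * math.exp(-abs(x))

def cosine_transform(func, s, upper=40.0, step=1e-3):
    """Trapezoidal estimate of the Fourier transform of an even function,
    2 * integral_0^upper func(x) cos(2 pi s x) dx."""
    n = int(round(upper / step))
    total = 0.5 * (func(0.0) + func(upper) * math.cos(2.0 * math.pi * s * upper))
    for i in range(1, n):
        x = i * step
        total += func(x) * math.cos(2.0 * math.pi * s * x)
    return 2.0 * step * total

for s in (0.0, 0.5, 1.0):
    print(s, cosine_transform(g2, s), bessel_macdonald_ft(s, 2))
```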

5  Other  Approximation  Schemes 

Other choices of integral representation lead to different approximation schemes and different spaces of functions that can be approximated with a similar convergence rate. For example, using the Fourier representation of a function (if it exists) we have:

$$f(x) = \int_{\mathbb{R}^d} ds\, \cos(s \cdot x + \theta(s))\, |\tilde f(s)| \qquad (12)$$

where $\theta(s)$ is the phase of the Fourier transform $\tilde f(s)$ of $f$. Jones (1990) considers the space $A(\mathbb{R}^d)$ (appendix B) of the functions whose Fourier transform is in $L_1(\mathbb{R}^d)$ and shows that they can be approximated by functions of the form

$$f_n(x) = \sum_{i=1}^{n} \lambda_i \cos(t_i \cdot x + \theta_i) \qquad (13)$$

with the rate of convergence $O(1/\sqrt{n})$.

Another result of this type has been proved by Barron (1991). He considers the set of the functions such that

$$\int_{\mathbb{R}^d} ds\, \|s\|\, |\tilde f(s)| < +\infty \qquad (14)$$

that is the functions whose gradient is in $A(\mathbb{R}^d)$, and approximates elements of this set by functions of the form

$$f_n(x) = \sum_{i=1}^{n} \lambda_i\, \sigma(t_i \cdot x + \theta_i) ,$$

where $\sigma(\cdot)$ is any sigmoidal function. Condition (14) can be rewritten as

$$\|s\|\, |\tilde f(s)| \in L_1(\mathbb{R}^d) .$$

Denoting by $I_d$ the function

$$I_d(x) = \frac{1}{\|x\|^{d-1}} \qquad (15)$$

and noticing that its Fourier transform is $\tilde I_d(s) = \|s\|^{-1}$, we can also say that the space of functions that satisfy condition (14) is the space of functions that can be written as

$$f = I_d * \lambda , \qquad \lambda \in A(\mathbb{R}^d) . \qquad (16)$$

There is a remarkable analogy between this set of functions and the function space $\mathcal{L}_1^m$ considered in section (4.2), that is the set of functions such that:

$$f = G_m * \lambda , \qquad \lambda \in L_1(\mathbb{R}^d) , \qquad m > d . \qquad (17)$$

In eq. (16), the function $I_d$ goes to zero faster and faster as $d$ increases, while its Fourier transform remains unchanged. In eq. (17), because of the constraint $m > d$, it is the Fourier transform of $G_m$ that goes to zero faster and faster as $d$ increases, while the asymptotic decay of $G_m$ is always exponential. Moreover, in eq. (17) $\lambda$ has to belong to $L_1$, while in eq. (16) it is the Fourier transform of $\lambda$ that belongs to $L_1$.
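The two ridge-function schemes reviewed in this section differ only in the one-dimensional activation applied to $t_i \cdot x + \theta_i$. The sketch below evaluates both forms with the same helper; the function names, the logistic choice of sigmoid and the sample parameters are assumptions made for illustration.

```python
import math

def ridge_expansion(x, weights, directions, offsets, activation):
    """Evaluate sum_i lambda_i * activation(t_i . x + theta_i)."""
    return sum(
        lam * activation(sum(ti * xi for ti, xi in zip(t, x)) + theta)
        for lam, t, theta in zip(weights, directions, offsets)
    )

def logistic(z):
    """One common sigmoidal function (an illustrative choice)."""
    return 1.0 / (1.0 + math.exp(-z))

weights = [2.0, -1.0]
directions = [(1.0, 1.0), (0.5, -2.0)]
offsets = [0.0, 0.3]
x = (0.2, -0.1)

print(ridge_expansion(x, weights, directions, offsets, math.cos))   # cosine form, as in eq. (13)
print(ridge_expansion(x, weights, directions, offsets, logistic))   # sigmoidal form
```

Each term here carries $d + 2$ parameters (a weight, a direction and an offset), compared with $d + 1$ for a translate of $G$.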

6  Conclusions 

We  briefly  summarize  the  main  results  presented  in  this  paper. 

•  Let $f$ be a function on $\mathbb{R}^d$ and assume that $f$ can be written as $f = G * \lambda$, where $G$ is square integrable on $\mathbb{R}^d$ and $\lambda$ is a signed Radon measure of bounded total variation. Then there is a linear superposition of $n$ translates of $G$ that approximates $f$ in the $L_2$ norm with a rate of convergence $O(1/\sqrt{n})$.

•  Let $f$ be a function on $\mathbb{R}^d$ whose Fourier transform has compact support and $k$ continuous derivatives, with $k > \frac{d}{2}$. Then there exists a Gaussian Radial Basis Functions expansion with $n$ basis functions that approximates $f$ in the $L_2$ norm with a rate of convergence $O(1/\sqrt{n})$. The same result holds for approximation in the sup norm.

•  Let $f$ be any function of the Sobolev space $H^{m,1}(\mathbb{R}^d)$, with $m > d$, $m$ even. Then there exists a Radial Basis Functions expansion, whose basis function is the Bessel-Macdonald kernel $G_m(x)$, that approximates $f$ with a rate of convergence $O(1/\sqrt{n})$ in the norm of $H^{s,2}$ with $\frac{d}{2} < s < m - \frac{d}{2}$. A similar rate of convergence can also be obtained for the approximation in the sup norm.

All these examples involve spaces of functions with a number of derivatives that increases with the dimension, and are consistent with the intuitive idea that spaces of functions in a high number of dimensions are very difficult to approximate, unless some constraints are imposed to prevent their “size” from growing exponentially fast.

One interesting feature of these results is that, thanks to the constructive nature of Jones’ and Barron’s lemma, an iterative procedure is provided that can achieve that rate. Clearly, these results concern the approximation of a function $f$ which is known everywhere, while in many practical situations one would like to construct an approximation of a function $f$ knowing only the values of $f$ on some (finite) set of points. For this last problem, in the case of approximation by sigmoidal ridge functions, some results by Barron (1992) are already available, and show that even with this further source of error one can obtain results “independent of the dimension”, for suitable spaces of functions. It should be possible to obtain similar results for the approximation scheme we considered here, using the same technique.

Acknowledgements  We  thank  Tomaso  Poggio  for  useful  discussions  and  for  a 
critical  reading  of  the  manuscript. 

A  The  Bochner  Integral 

Let $\Omega \subseteq \mathbb{R}^d$ and let $\lambda$ be a positive measure on $\Omega$. For functions $f : \Omega \to X$ with $X$ a Banach space there are several available notions of measurability and integration (Dunford and Schwartz, 1958; Diestel and Uhl, 1977). In particular, for all (strongly) $\lambda$-measurable functions $f$ such that $\int_\Omega \|f\|_X\, d\lambda < +\infty$ we can define the Bochner integral

$$\int_\Omega f\, d\lambda . \qquad (18)$$

Clearly if $\lambda$ is a Borel measure the continuous functions $f : \Omega \to X$ are (strongly) measurable. One has lemma A.1 below (Diestel and Uhl, 1977, page 48).

Lemma A.1 Let $\lambda$ be a positive Borel measure on $\Omega \subseteq \mathbb{R}^d$ and $f(t) : \Omega \to X$ with $X$ a Banach space. If $f$ is Bochner integrable with respect to $\lambda$ then

$$\frac{1}{\lambda(\Omega)} \int_\Omega f\, d\lambda \in \overline{\mathrm{co}\, f(\Omega)} .$$

If one considers a signed Radon measure $\lambda$ on $\Omega$ one can still define the integral of a measurable function $f : \Omega \to X$ with respect to $\lambda$ as

$$\int_\Omega f\, d\lambda = \int_\Omega f(t)\, \varphi(t)\, d|\lambda|(t) \qquad (19)$$

where $|\lambda|$ is the total variation of $\lambda$ and $\varphi$ denotes the Radon-Nikodym derivative of $\lambda$ with respect to $|\lambda|$. From lemma (A.1) one can easily obtain:

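Lemma A.2 Let $\lambda$ be a signed Radon measure on $\Omega \subseteq \mathbb{R}^d$ and $f(t) : \Omega \to X$ with $X$ a Banach space. If $f$ is $\lambda$-measurable and is such that

$$\int_\Omega \|f\|\, d|\lambda| < +\infty$$

then the Bochner integral of $f$ with respect to $\lambda$ is well defined and

$$\frac{1}{\|\lambda\|} \int_\Omega f\, d\lambda \in \overline{\mathrm{co}\, S} \qquad (20)$$

where

$$S = \{ s f(t) \mid t \in \Omega , \ s \in \mathbb{R} , \ |s| \le 1 \} .$$

In fact the scalar function $\varphi(t) = \frac{d\lambda}{d|\lambda|}(t)$ is measurable, the function $f(t)\varphi(t)$ is measurable, and moreover

$$\int_\Omega \|f(t)\varphi(t)\|\, d|\lambda|(t) = \int_\Omega \|f\|\, d|\lambda| < +\infty .$$

Hence the integral $\int_\Omega f\, d\lambda$ is well defined as the right member of (19). Then by lemma (A.1) applied to the function $h(t) = f(t)\varphi(t)$ one has:

$$\frac{1}{\|\lambda\|} \int_\Omega f\, d\lambda \in \overline{\mathrm{co}\, h(\Omega)} .$$

On the other hand, since $|\varphi| = 1$ one has

$$\overline{\mathrm{co}\, h(\Omega)} \subseteq \overline{\mathrm{co}\, S}$$

and (20) follows.

B Sobolev Spaces and the Space A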

Here  we  collect  a  few  facts  about  certain  spaces  of  functions  frequently  used 
in  the  paper. 

Sobolev Spaces. For each positive integer $s$ and $1 \le p < \infty$ one defines the Sobolev space $H^{s,p}(\mathbb{R}^d) = H^{s,p}$ as the space of those $L_p$ functions in $\mathbb{R}^d$ whose derivatives up to the order $s$ are $L_p$ functions. The space $H^{s,p}$ is a Banach space with the norm

$$\|u\|_{H^{s,p}} = \Big( \sum_{|\alpha| \le s} \|D^\alpha u\|_{L_p}^p \Big)^{1/p}$$

where $\alpha$ is a multi-index and $D^\alpha$ is the derivative of order $\alpha$. The space $H^{s,2}$ is a Hilbert space with respect to the scalar product

$$(u, v) = \sum_{|\alpha| \le s} \int_{\mathbb{R}^d} D^\alpha u\, D^\alpha v .$$

One has also the characterization

$$H^{s,2} = \{ u \in L_2 \mid (1 + |\omega|^2)^{s/2}\, \tilde u \in L_2 \}$$

which can be used also to define the Sobolev spaces $H^{s,2}$ for non-integer $s$. One has the following result, which is a special case of the Sobolev embedding theorem (Stein, 1970):

Theorem B.1 If $k$ is a nonnegative integer and $s > k + \frac{d}{2}$ then $H^{s,2} \subset C^k$ and there is a constant $C_1$ such that

$$\max_{|\alpha| \le k}\, \sup_{x \in \mathbb{R}^d} |D^\alpha f(x)| \le C_1 \|f\|_{H^{s,2}} .$$


The Fourier algebra A. The space $A$ of the tempered distributions whose Fourier transform is a summable function is in current use in Fourier analysis (Herz, 1968; Katznelson, 1968). One has

$$H^{k,2} \subset A \quad \text{for } k > \frac{d}{2} .$$

In fact (Barron, 1991; footnote) one may write

$$\tilde f(\omega) = \frac{1}{(1 + |\omega|^2)^{k/2}} \Big( \tilde f(\omega)\, (1 + |\omega|^2)^{k/2} \Big)$$

where both factors on the right side belong to $L_2$ if $k > \frac{d}{2}$. In particular it follows that $C_c^k \subset H^{k,2} \subset A$ for $k > \frac{d}{2}$.
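The step from the factorization to the inclusion can be written out explicitly. The following is a sketch of the standard Cauchy-Schwarz argument, assuming $f \in H^{k,2}$ in the sense of the Fourier characterization of the Sobolev spaces given earlier in this appendix:

```latex
\int_{\mathbb{R}^d} |\tilde f(\omega)|\, d\omega
  \;\le\;
  \left( \int_{\mathbb{R}^d} \frac{d\omega}{(1 + |\omega|^2)^{k}} \right)^{1/2}
  \left( \int_{\mathbb{R}^d} (1 + |\omega|^2)^{k}\, |\tilde f(\omega)|^2\, d\omega \right)^{1/2}
```

The second factor is finite because $(1 + |\omega|^2)^{k/2} \tilde f \in L_2$, and the first is finite exactly when $2k > d$, since in polar coordinates the integrand behaves like $r^{d - 1 - 2k}$ for large $r$. Hence $\tilde f \in L_1$, that is $f \in A$.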

It is also clear that $A \subset C_0$, where $C_0$ is the completion in the $L_\infty$ norm of $C_c^0$, i.e. the space of continuous bounded functions that converge to zero for $\|x\| \to \infty$.